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PACS 87.15.H — Dynamics of biomolecules 

PACS 82 . 20 . Pm - Rate constants, reaction cross sections, and activation energies 
PACS 87.14.gk-DNA 

Abstract - The dynamics of the DNA denaturation is studied using the Peyrard-Bishop-Dauxois 
model. The denaturation rate of double stranded polymers decreases exponentially as function 
of length below the denaturation temperature. Above Tc, the rate shows a minimum, but then 
increases as function of length. We also examine the influence of sequence and solvent friction. 
Molecules having the same number of weak and strong base-pairs can have significantly different 
opening rates depending on the order of base-pairs. 



^ , For decades, experimental and theoretical scientists 
"^have been fascinated by the thermal DNA denatura- 
J>~i,ion [T]. It is biologically relevant since the opening of 
the double helix in an important step for the transcrip- 
i_^]tion of the genetic code [2] but it is also becoming im- 
portant for nanotcchnology as DNA is now used for its 
^-H self-assembly properties [3] , to create nanodevices [J or 
^tfo design molecular memories [S]. The different AT and 
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GC base-pairs (bps) arc intra-connectcd by two and three 
{y-^ hydrogen-bonds, respectively. Denaturation experiments, 
ip which UV absorbance is measured as function of tem- 
^Iperature, show discrete steps associated to the sequential 
opening of soft and stiff regions in the DNA sequence. 
These regions mainly differ by their density of weak AT- 
yersus strong GC-bps, but also the specific order is im- 

* * 

J> portant [6] . On the other hand, large homogeneous chains 
k>( denature in a single step within a very small tempera- 
rS tfure interval, a remarkable realization of an effectively 
Onc-dimcnsional phase transition [7] . 

Theoretically, the statistics of DNA have been modeled 
via Ising type models [S] in which each bp is given a value 
or 1 depending on whether it is open or closed. This 
approach allows the study of extremely large sequences 
but cannot be used to study dynamics. This has been 
the principle motivation for Peyrard, Bishop and Daux- 
ois to develop a continuous model (PBD) [9]. The PBD 
model is computationally somewhat more expensive than 
the Ising type models, but still quite efficient due to its 
mesoscopic character. As a result, the PBD model is very 
well suited to study dynamics of reasonably long DNA 



molecules at timescales that are not reachable with full- 
atom simulations [TU]. The PBD model, therefore, meets 
our requirements that we need to study the dynamics of 
the DNA denaturation transition. In specific, wc want to 
understand how the rate of full denaturation depends on 
several parameters such as length, sequence, solvent fric- 
tion, and temperature. 

The PBD model has revealed a rich spectrum of dy- 
namical phenomena such a the existence of nonlinear lo- 
calized modes [TT]. Ironically, most studies on the PBD 
model that apply specific types of dynamics, such as Nose- 
Hoover, Brownian motion, and Langevin (sometimes even 
using a configurational dependent friction parameter |12j). 
have reported on equilibrium properties like the fraction 
of open bp's at a given temperature. These properties, in 
principle, do not depend on the dynamics |13j . So far, the 
study on actual dynamics has been limited to the local 
opening of DNA [15] or denaturation at elevated temper- 
atures [12 . For applications such as nanodevices [3] on 
molecular memories [5], the timing becomes important. 

The dynamics of full denaturation is difficult to access 
using standard MD as it is a rare event on the timcscalc 
that is achievable by molecular simulations. Transition 
interface sampling (TIS) |16j and the more recent Replica 
Exchange TIS [17] are powerful techniques to beat this 
timcscale problem. However, a very efficient implementa- 
tion of the reactive flux (RF) method [TB] can treat the 
PBD model even more efficiently [TT]- The RF method 
starts from the Transition State Theory (TST) expression, 
but corrects for correlated recrossings via a dynamical 
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factor, the transmission coefficient. Unlike TST, the RF 
method provides an exact expression for the rate constant 
that is independent to the choice of reaction coordinate 
(RC). In Ref. [T7|, we showed how the free energy and the 
transmission coefficient can be calculated very efficiently 
by exploiting specific peculiarities of the PBD model. Us- 
ing this approach, we can achieve rates far below 10~^^ 
ns~^ or, in other words, relaxation times of several days 
or even years. We are not aware of any mcsoscopic model- 
ing approach that is able to reach such timescales. In this 
letter, we apply this method to determine the dependence 
of the denaturation rate of double stranded DNA as func- 
tion of its length, the temperature and solvent friction, 
the content of weak AT versus strong CG bp's, and their 
specific order. 

The PBD describes the DNA molecule as an one- 
dimensional chain of effective atom compounds yielding 
the relative base-pair separations yi from the ground state 
positions. The total potential energy U for an N base-pair 
DNA chain is given by = Vi{yi) + y^iy^) + 

W{yi,yi-i) with 

V^,(y.)-A(e-"-^'-i)' (1) 
W{y,,y,^^) = i/^(l + pe-"(^'+^'-))(y,-2;,_i)2 

The first term Vi is the Morse potential describing the 
hydrogen bond interaction between bases on opposite 
strands. Di and determine the depth and width of 
this potential for the AT and GC base-pairs. Note that 
the PBD model does not distinguish between A- and T- 
nor between G- and C-bases. The second term W is the 
stacking interaction. The p-term makes that the effective 
strength of the stacking interaction drops from K{1 + p) 
down to K whenever either or yi-i becomes signifi- 
cantly larger than 1/a. This effect mimics the decrease of 
overlap between tt- electrons when one of two neighboring 
bases move out of stack and it is thanks to this that the 
sharp phase transition in long homogeneous chains can be 
reproduced. 

Our results are primary focused on the parameter set 
of Campa and Giansanti [19] with K = 0.025 eV/A^, 
p = 2, a ^ 0.35 A-i, Dat = 0.05 eV, Dae = 0.075 
eV, aAT = 4.2 A^^, ace ~ 6.9 A^^. However, we wiU 
also shortly investigate the more recent parameter set by 
Theodorakopoulos K = 0.00045 eV/A^, p = 50, 

a = 0.2 A-i, Dat = 0.1255 eV, Dgc = 0.1655 eV, 
aAT = 4.2 A"-"^, ace = 6.9 A~^. Particular of this data-set 
is the very high p value which should reflect the large dif- 
ference in persistence length of single stranded and double 
stranded DNA 21]. A = min[{?/i}] was chosen as reaction 
coordinate RC [17] and yo = 1 A as the opening threshold. 
Henceforth, yi > yg implies that base-pair i is open and 
A > j/o that the complete molecule is denatured. 

The RF method expresses the overall reaction rate as an 
equilibrium probability density to be at a surface on the 
barrier (here defined by X{{yi}) = ya), under the condition 



that the system is at the reactant side of this surface, times 
a dynamical transmission factor. 

fc = P(A = yo\X <yo)xR with R = (x9{X)h'^ A 

\ ' I \=ya 

(2) 

The dynamical factor R is called is the unnormalized 
transmission coefficient. The brackets with subscript de- 
note an ensemble average that is constrained on the sur- 
face X{{yi\) = j/o- It is calculated by releasing dynamical 
trajectories forward and backward in time starting from 
a proper equilibrium ensemble of configuration points on 
this surface and Maxwell-Boltzmann distributed veloci- 
ties, /ig is a binary function that is 1 if the backward 
trajectory from such a point evolves to the minimum of 
the Morse potential (A = A) without recrossing the ini- 
tial surface (A = y^). Otherwise it is 0. Since this surface 
is at the frontier of what is considered to be the product 
state, no forward trajectory is needed as in Ref. |53], since 
A({i/i}) = 2/0 and A > implies that the product state will 
be entered within an infinitesimal small time step. This 
procedure is sketched in fig. [1] 

Assuming y^ = 1 whenever A > would result into 
the TST approximation of the reaction rate. In practice, 

vo ~ ^ ^'^'^ ^^^^ majority of trajectories though its 
average value remains measurable for this RC due to the 
flat plateau of the Morse potential. The time duration 
of these trajectories are much shorter than the actual re- 
laxation time that relates to the long time evolution of 
the system having relatively small oscillations. Therefore, 
even though the very large fluctuations of these short tra- 
jectories are at the limit of what the PBD can describe 
accurately, their short duration, with respect to the ac- 
tual relaxation time, make quantitative rate calculations 
reliable. Also the fact that we stop our trajectories when- 
ever yi > yo for all i is reasonable. Besides that the PBD 
model isn't too accurate beyond this point, it is expected 
that the chance of reclosing is even smaller in a more accu- 
rate three-dimensional model. We can, therefore, assume 
that the system is committed to the denatured state be- 
yond this point and it is therefore sufficient to stop the 
trajectory whenever the yo transition surface is crossed. 
Still, it is fair to say that our approach and RC will prob- 
ably not work for more elaborated potentials that contain 
explicit potential barriers for reclosing [53] since it will 
result in i? ^ 1. 

The equilibrium density is related to the free energy via 
F{X) = —kBT\nP{X) with ks the Boltzmann constant 
and T the temperature in Kelvin. The surface does not 
necessarily have to be at the local maximum of the free 
energy profile, the transition state (TS), though this is 
generally the most efficient choice since it maximizes the 
transmission R. In our case the surface X{{yi}) — yo is 
slightly beyond the TS at the beginning of the denatu- 
ration state or product state region. This has, however, 
no effect on the final results since the transmission co- 
efficient and probability density are like communicating 
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Fig. 1: (color online) The above figure illustrates an example 
configuration (solid red spheres) at the surface A({j/i}) = yo 
that was generated by our algorithm for a homogeneous A*' — 
8 AT-chain. It has one particle that is exactly at yo while 
3/i > yo for all others. The arrows envision the corresponding 
Maxwellian velocities. Each of these sampled phases points 
will give a single value x that is either zero or positive. If A 
(which is simply the velocity of the particle that is exactly on 
the surface) equals less than zero, then a; = 0. If A > 0, all 
velocities of the system will be reversed and the dynamics are 
propagated using the Langevin dynamics. These dynamics are 
continued until one particle reaches the well (j/i < for one 
particle or A < 0) or until all particles move beyond the initial 
surface {yi > yo for all particles or A > yo)- In the last case 
X is assigned zero again. In the first case x equals the initial 
velocity A. The final configuration is shown by the green open 
circles. A particle at the end of the chain has moved inside 
the well of the Morse potential (hence x = X for this case). 
This event implies a commitment to the natured state; this 
particle will quickly pull the others inside the well and a rapid 
recrossing with the yo surface is highly unlikely after this point. 
The unnormalized transmission coefficient is the average of the 
sampled values x: R — x. 



vessels; a too high value for P (or too low value of AF) 
is compensated by a lower transmission R. Therefore, the 
calculated rate values are insensitive to the exact location 
of the transmission surface. 

Because of the first neighbor character of the PBD 
model, the free energy or probability density P(A = yo\X < 
yo) can be computed very efficiently by means of an iter- 
ative numerical integration scheme [17] . 
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Now, as the integrals of Eq. ^ are all of a special facto- 
rial form |22| , we can apply the direct integration method 
of Ref. [12] using an integration step of dy = 0.05 A and 
integration boundaries — j/i-i < d~ ^J2\ \nT\/ (3K and 

-l/flATln [ ^\\nT\//3DAT + l\ < y^ < Vo + ^/Nd. The 

tolerance r is set to 10^*° which implies that all contri- 
butions of e~^^'^^y^^^ lower than this value are neglected. 
The very low value of K in the parameter set of [50] 



with respect to |19j implies significant larger integration 
boundaries and a 50 times higher computational cost. 

In the next step, we need to generate a representative 
set of configurations on the surface X{{yi}) = yo- Also 
this can be achieved very effectively for this particular 
model and RC. We make use of the fact that ensemble 
constraint to this surface is identical to a freely mov- 
ing chain on a translational invariant potential U' with 
U'{{y^)) = U{{y^^\{{y,}) + yo}) M- Therefore, we gen- 
erate the required surface points by running a MD simu- 
lation using potential U' , save every 1000th time step to 
dissolve correlations. Then, we shift these configurations 
back to the surface \{{yi\) = yo- From each point, we 
release a backward trajectory using normal potential U 
and determine X9{\)ho^yg- We averaged over 10^ trajecto- 
ries using a time step of 1 fs and bp masses of 300 amu. 
The temperature was controlled using Langevin dynam- 
ics with a friction coefficient of 7 = 50 ps~^ unless stated 
otherwise. 

It is important to stress that factor P is purely a ther- 
modynamic property (i. e. independent of 7 or masses) 
while R depends on the full dynamics. This methodol- 
ogy allows us for the first time to examine the theoretical 
PBD denaturation rates of long DNA molecules as func- 
tion of sequence, solvent friction, temperature, and model 
parameters. 

Fig. 12] shows the calculated denaturation rates of homo- 
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Fig. 2: (color online) Top: Denaturation rate of homogeneous 
AT chains as function of the number of bps at different tem- 
peratures. Inset shows the same results in log-scale. Bottom: 
The corresponding transmission coefficients. 



geneous A/T-DNA sequences of different lengths and tem- 
peratures. Below the critical denaturation temperature 
(Tc ~ 325A') the denaturation rate of the DNA molecule 
is exponentially decreasing as function of its length. Above 
Tc, there is an initial decrease as function of the number of 
bps N , but then starts to increase again. Both the depth 
and the position of the minimum decrease as function of 
temperature. This feature is mainly an effect of equilib- 
rium statistical physics (factor P) rather then dynamics 
(factor R) as we can conclude from the lower panel of 
fig. [2] This panel shows the normalized transmission coef- 
ficient K = R/ ( \6{\) ) = \/2TTl3mR which is equivalent to 
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the true rate constant divided by the transition state the- 
ory (TST) value. The results show that, although k is far 
smaller than one, it is more or less constant with respect 
of temperature and chain length. Only for short polymers 
< 20 there is a noticeable upturn of k. Henceforth, 
despite that TST overestimates the rate by more than a 
factor 40, the relative TST rates are approximately cor- 
rect. 

An increase of denaturation rate as function of its length 
is remarkable even for T > Tc- For macroscopic systems 
having two phases A and B, crossing the phase transi- 
tion temperature coincides with a discontinuous jump of 
the equilibrium constant from zero to infinity. Since the 
equilibrium constant is related to the ratio of forward and 
backward rate constants, kA^B /ks^A goes from zero to 
infinity in the limit A — >■ cxd when T crosses Tc- Still, 
this does not imply that the absolute values of any of the 
two rates should increase as function of A^. In fact, the 
decreasing denaturation time r ~ as function of A^ is 
in contrast to the power-law scaling r oc N'^-^'^ of a non- 
equilibrium case study |14j . 

To examine the effect of temperature on the equilib- 
rium statistics, fig. [3] shows the free energy F{\') = 
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Fig. 3: (color online) The free energy profile as function of 
the RC for homogeneous A/T chains of different lengths and 
temperatures T =300 K (top) and T =350 K (bottom). 

-In [P(A = A'|A < yo)/P°] / fi with P" = 1/A being an 
arbitrary reference density. At temperature T = 300 K 
the free energy barrier is clearly increasing as function of 
chain length while it is slightly decreasing at T = 350 K. 
The longer polymers profit from the larger entropy gain 
at high T. 

In fig. m we examined the effect of heterogeneities 
in the DNA sequences at room temperature conditions 
(T = 300 A'). The red and blue curve represent the denatu- 
ration rate for the homogeneous A/T and G/C-sequences. 
The green curve in the middle corresponds to sequences 
where the first half is purely A/T and the other half G/C. 
All curves become straight lines for sufficiently large A^ 
which implies an exponential dependence on the chain 
length. Linear fits in the log-plot reveal that k sa ab^ 
with a(l/ns),& = 0.240,0.897 for the A/T- and 0.436, 
0.678 for the G/C-chain. The mixed AG-sequence has 
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Fig. 4: (color online) Denaturation rate versus the number 
bps for homogeneous A/T- (red line) and G/C-chains (blue) 
and a sequence having N/2 consecutive A's followed by N/2 
consecutive G's (green). Crosses correspond to the parameter 
set of Ref. [19], triangles correspond to Ref. [20]. Blue line 
indicates the rate (right and top axis) for a mixed 40 bp chain 
having M consecutive A- followed by 40-M G-bps. Dashed 
black lines are best linear fits in the log-plot. 



a = 0.346 and b = 0.778. The latter value is very close to 
^/b{A)Vb{G) = 0.780 which suggest k cx b{A)^-^b{G)^<^ 
as a general rule for a mixed sequence where Na and No 
are the number of A/T- or G/C-bps. The results based on 
the alternative parameter set of Ref. [ID] show the same 
trend but predicts denaturation rates that are orders of 
magnitude lower. The linear fits give a,b = 0.039, 0.639 for 
A/T, 0.048, 0.240 for GC-, and 0.0249, 0.415 f or the mixed 
chain. The latter is again very close to y^o(]4)xl)(G) 
which corroborates with the above. In fig. H] we also show 
the denaturation rate for a polymer of fixed A^ = 40 
length as function of the A,G-content using parameter 
set [19]. Like before, the order of the polymer is such 
that the A- and G-bps are placed consecutively at oppo- 
site sides. Making the linear fit in the log-plot shows that 
k « 7.06(l/ns)1.31^'* which is again in excellent agree- 
ment with the above rule since b{A)/b{G) = 1.32. 

In Fig. [S] we examined the infiuence of the friction con- 
stant 7. As the dynamics do not change the equilibrium 
distribution they can only have an effect on k. In agree- 
ment with previous finding, we see that the transmission 
coefficient is hardly affected by the polymer length, the 
type of bps and temperature. In conclusion, k is mainly 
determined by the friction coefficient 7. At high friction 
we notice a standard Kramer's behavior |25j k cx I/7 
which is somewhat surprising given the exotic nature of 
the RC that depends on all yi coordinates in a discontin- 
uous way. Contrary to Kramer's theory, k converges to 
values in the range 0.38 — 0.48 and does not approach 1 
in the limit 7 — >■ 0. 

Finally, we also examined the infiuence of the order of 
the sequence on the denaturation rate. In fig. [5] we show 
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Fig. 5: (color online) Transmission coefficient versus the fric- 
tion coefficient 7 for a homogeneous A- (solid line) or G- 
(dashed line) at temperatures T = 300 K (spheres) or 350 
K (triangles) consisting of 20 (green) or 40 (blue) bps. Inset 
shows the same results in a log-log plot. Dashed black line 
corresponds to the fit ^(7) = 1.3 ps^^/7. 



normalized denaturation rates as function of for se- 
quences that all contain 50% A- and 50% G-bases. The 
computed rate constant where normalized by the values 
of fig. m Hence, all values are relative to a chain having 
all A's and all G's at one side of the polymer. We call 
this the AAGG-sequence after the A^ = 4 polymer having 
this characteristics. Besides the AAGG-sequence, we ex- 



3.0 


















1-3 














2.6 


1.0 


* i t 


■ 1 










2.2 


0.7 

■ 














3 10 20 


30 40 








1.8 












AGGA - 




1.4 




AGAG — 








GAAG 




1.0 




J- • 










-i 










>-/-- 









20 40 60 80 100 120 140 
N 



Fig. 6: (color online) Relative denaturation rates versus chain 
length for DNA sequences having 50% A- and 50% G-bps. The 
curves only differ in the ordering of bps. All values are normal- 
ized by the chain having all A's at one side and the G's at the 
other side ("AAGG" after the N=4 bp chain). The alternating 
sequence AGAG (red) is shown together with the sequences 
having an A- (blue) or G-block (green) in the middle, respec- 
tively GAAG and AGGA. Full circles with error-bars show the 
results of the full calculation while the solid line is the result 
when transmission coefficients k would be considered to be 
identical for all sequences. Inset shows the same results for the 
alternative parameter set of Ref. [20) . 



amined the alternating sequence AGAG, the "weak-block 
in the middle" sequence GAAG, and the "strong-block in 
the middle" sequence AGGA. The results were computed 



in two ways i) using the full rates, ii) using the proba- 
bilities densities P alone. The second approach basically 
assumes that all transmission coefficients are identical for 
all sequences which seems to be right considering previous 
conclusions and has the advantage that its result is not 
affected by statistical uncertainties. 

Fig. [5] shows that the AGAG sequence gives significant 
faster denaturation compared to the AAGG-chain and this 
effect increases with chain length. The GAAG-chain is 
second best being approximately 13% faster than AAGG 
for iV > 20 without further increase as function of chain 
length. Conversely, the AGGA-scquence is about 10% 
slower than AAGG. Naturally, the bases at the end of 
the polymer are the most easy to open. Apparently, the 
best way to profit from this effect is to put the bases at 
the end that are difficult to open otherwise, but an even 
distribution of the G-bases is by far the best. 

The results of the alternative parameter set, however, 
(inset in fig. ^ seems to give a somewhat different con- 
clusion. The statistical error-bars are much larger for this 
parameter set due to the very weak coupling K which 
increases the accessible phase-space. Still, AGGA and 
GAAG seem to be reversed regardless method and i) or ii) 
is applied. The AGGA seem to denature even faster than 
the AGAG-chain though this might be just a statistical 
fluctuation. If we consider the method ii), we see that the 
red curve starts to grow as function of N and surpasses the 
AGGA result at = 40. Therefore, we believe that the 
alternating sequence has the fastest denaturation irrespec- 
tive of the parameter set for sufficiently large N. Besides 
their interest for studies involving real devices, those re- 
sults on the effect of the sequence are important because 
they provide experimentally testable data |26] that dis- 
criminate between two parameter sets that give very simi- 
lar equilibrium properties. This offers a unique possibility 
to validate model parameters. 

In conclusion, we have utilized an innovative approach 
to study the denaturation rate as function of different pa- 
rameters. This has resulted in several predictions which 
can be tested experimentally. DNA models, giving similar 
results regarding statistics, can differ fundamentally when 
denaturation arc concerned. Therefore, our methodology 
in combination with experiments provides an additional 
dimension to probe the validity of DNA models and to 
improve them. This will be a significant step forward in 
understanding the sequence-dependent dynamics of DNA 
which play a crucial role in several biological processes 
and new DNA-based devices for which the time response 
is important. 

* * * 
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